Ring-oriented Block Matrix Factorization Algorithms for Shared and Distributed Memory Architectures

Authors

Krister Dackland

Erik Elmroth

Abstract

Drawing on experience from implementations on shared memory multiprocessors (SMM) and distributed memory multicomputers (DMM), general ring-oriented routines are developed for the LU, Cholesky, and QR factorizations. Since all machine dependencies are confined to a small set of communication routines, the same factorization routines can be used on both the SMM and DMM architectures. The algorithms are described at a high level, with a focus on portability. Further, detailed implementations of the LU factorization and machine-specific communication routines for the Alliant FX2816, Intel iPSC/2, and IBM 3090VF/600J are enclosed. Timing results show that the performance of the machine-specific implementations is preserved by the general ring-oriented block algorithms.
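The separation the abstract describes, factorization code that touches the machine only through a few communication routines, can be illustrated with a minimal sketch. It is not the authors' implementation. The names `LocalRing`, `owner`, and `ring_lu` are hypothetical, the ring is simulated inside one Python process, block columns are assumed to be wrap-mapped (block j on node j mod p), and pivoting is omitted for brevity.

```python
from collections import deque


class LocalRing:
    """Hypothetical stand-in for the machine-specific communication layer.

    Simulates a ring of `nprocs` nodes in one process. A real port would
    replace only this class (message passing or shared-memory copies).
    """

    def __init__(self, nprocs):
        self.nprocs = nprocs
        self.inbox = [deque() for _ in range(nprocs)]

    def send_right(self, rank, msg):
        # Deliver msg to the right-hand neighbour on the ring.
        self.inbox[(rank + 1) % self.nprocs].append(msg)

    def recv_left(self, rank):
        # Take the oldest message that arrived from the left neighbour.
        return self.inbox[rank].popleft()


def owner(block_col, nprocs):
    """Assumed block column wrap mapping: block j lives on node j mod nprocs."""
    return block_col % nprocs


def ring_lu(a, nb, ring):
    """In-place LU without pivoting; L (unit diagonal) and U overwrite `a`."""
    n, p = len(a), ring.nprocs
    nblocks = (n + nb - 1) // nb
    for k in range(nblocks):
        lo, hi = k * nb, min((k + 1) * nb, n)
        root = owner(k, p)
        # The owning node factors its panel (columns lo..hi-1).
        for j in range(lo, hi):
            for i in range(j + 1, n):
                a[i][j] /= a[j][j]
                for c in range(j + 1, hi):
                    a[i][c] -= a[i][j] * a[j][c]
        panel = [row[lo:hi] for row in a]
        # The panel travels once around the ring; each node updates
        # the trailing block columns it owns.
        for step in range(p):
            rank = (root + step) % p
            if step > 0:
                panel = ring.recv_left(rank)
            if step < p - 1:
                ring.send_right(rank, panel)
            for b in range(k + 1, nblocks):
                if owner(b, p) != rank:
                    continue
                for c in range(b * nb, min((b + 1) * nb, n)):
                    for j in range(lo, hi):
                        for i in range(j + 1, n):
                            a[i][c] -= panel[i][j - lo] * a[j][c]
    return a
```

Only `LocalRing` would change between an SMM and a DMM port in this sketch; `ring_lu` uses nothing but `send_right` and `recv_left`.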

Related resources

A Ring-Oriented Approach for Block Matrix Factorizations on Shared and Distributed Memory Architectures

A block (column) wrap-mapping approach for design of parallel block matrix factorization algorithms that are (trans)portable over and between shared memory multiprocessors (SMM) and distributed memory multicomputers (DMM) is presented. By reorganizing the matrix on the SMM architecture, the same ring-oriented algorithms can be used on both SMM and DMM systems with all machine dependencies compr...

Design and Performance Modeling of Parallel Block Matrix Factorizations for Distributed Memory Multicomputers

Efficient and scalable parallel block algorithms for the LU factorization with partial pivoting, the Cholesky, and QR factorizations in a distributed memory multicomputer environment are presented. The distributed system is viewed as a ring of processors and the algorithms correspond to shared memory algorithms parallelized on block level (explicit parallelism). Performance of the algorithms are ...

Tall and Skinny QR Matrix Factorization Using Tile Algorithms on Multicore Architectures

To exploit the potential of multicore architectures, recent dense linear algebra libraries have used tile algorithms, which consist in scheduling a Directed Acyclic Graph (DAG) of tasks of fine granularity where nodes represent tasks, either panel factorization or update of a block-column, and edges represent dependencies among them. Although past approaches already achieve high performance on ...

Enhancing Parallelism of Tile QR Factorization for Multicore Architectures

To exploit the potential of multicore architectures, recent dense linear algebra libraries have used tile algorithms, which consist of scheduling a Directed Acyclic Graph (DAG) of fine granularity tasks where nodes represent tasks, either panel factorization or update of a block-column, and edges represent dependencies among them. Although past approaches already achieve high performance on mod...

High-performance and Parallel Inversion of a Symmetric Positive Definite Matrix

We present families of algorithms for operations related to the computation of the inverse of a Symmetric Positive Definite (SPD) matrix: Cholesky factorization, inversion of a triangular matrix, multiplication of a triangular matrix by its transpose, and one-sweep inversion of an SPD matrix. These algorithms are systematically derived and implemented via the Formal Linear Algebra Methodology E...

Publication date: 1992
